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A modification of tlie nudged elastic band (NEB) method is presented that en- 
ables stable optimisations to be run using both the limited-memory quasi-Newton 
(L-BFGS) and slow-response quenched velocity Verlet (SQVV) minimisers. The 
performance of this new 'doubly nudged' DNEB method is analysed in conjunc- 
tion with both minimisers and compared with previous NEB formulations. We find 
that the fastest DNEB approach (DNEB/L-BFGS) can be quicker by up to two or- 
ders of magnitude. Applications to permutational rearrangements of the seven-atom 
Lennard-Jones cluster (LJ7) and highly cooperative rearrangements of LJ38 and LJ75 
are presented. We also outline an updated algorithm for constructing complicated 
multi-step pathways using successive DNEB runs. 
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I. INTRODUCTION 



Locating transition states on a potential energy surface (PES) proyides an important tool 
in the study of dynamics using statistical rate theories . Here we define a transition 

state according to the geometrical definition of Murrell and Laidler, stationary 
point with a single negative Hessian eigenvalue ^. Unfortunately, it is significantly harder 
to locate transition states than local minima, since the system must effectively 'balance 
on a knife-edge' in one degree of freedom. Many algorithms have been suggested for this 
purpose, and the most efficient method may depend upon the nature of the system. For 
example, different considerations probably apply if second derivatives can be calculated 
relatively quickly, as for many empirical potentials B]. Transformation to an alternative 
coordinate system may also be beneficial for systems bound by strongly directional forces 
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Algorithms to locate transition states can generally be divided into siiigle-ended 
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and double-ended methods, which are usuall y desig ned to find 



a transition state between two endpoints [lOl \5A [51 
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The result of a single-ended search may be a tran- 
sition state that is not connected to the starting point by a steepest-descent path, and such 
methods can be useful for building up databases of stationary points to provide a non-local 
picture of the potential energy surface, including thermodynamic and dynamic properties 
In contrast, double-ended methods are usually designed to characterise a particular 
rearrangement, and often do not produce a tightly converged geometry for the transition 
state. However, the resulting structures can be further refined using single-ended strategies. 



particularly eigenvector-following 
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this approach has been used in several previous studies 



of the field are available 0, 13], and readers are referred to these publications for further 
discussion. 

Our princi pal concern in the present work is the development of the double-ended NEB 
approach HQ HQ. The earhest doub.e-euded methods were probably the hnear a„d 

quadratic synchronous transit algorithms (LST and QST) which are entirely based on 
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interpolation between the two endpoints. In LST the highest energy structure is located 
along the straight line that links the two endpoints. QST is similar in spirit, but approx- 
imates the reaction path using a parabola instead of a straight line. Neither interpolation 
is likely to provide a good estimate of the path except for very simple reactions, but they 
may nevertheless be useful to generate initial guesses for more sophisticated double-ended 
methods. 

Another approach is to reduce the distance between reactant and product by some arbi- 
trary value to generate an 'intermediate', and seek the minimum energy of this intermediate 
structure subject to certain constraints, such as fixed distance to an endpoint. This is the 



basis of the 'Saddle' optimisation method and the 'Line Then Plane' pTl algorithm, 
which differ only in the definition of the subspace in which the intermediate is allowed to 
move. The latter method optimises the intermediate in the hyperplane perpendicular to 
the interpolation line, while 'Saddle' uses hyperspheres. The minimised intermediate then 
replaces one of the endpoints and the process is repeated. 

There are also a number of methods that are based on a 'chain-of-states' (CS) approach, 
where several images of the system are somehow coupled together to create an approximation 
to the required path. The CS methods mainly differ in the way in which the initial guess 
to the path is refined. In the 'Chain' method |i88| the geometry of the highest energy 
image is relaxed first using only the component of the gradient perpendicular to the line 
connecting its two neighbours. The process is then repeated for the next-highest energy 
neighbours. The optimisation is terminated when the gradient becomes tangential to the 



path. The 'Locally Updated Planes' method f89] is similar, but the images are relaxed in 
the hyperplane perpendicular to the reaction coordinate, rather than along the line defined 
by the gradient, and all the images are moved simultaneously. 

The nudged elastic band (NEB) approach introduced some further refinements to these 
CS methods ^. It i= b^ed on a di^catised _taticn ot the path oHginally p.opcad 
by Elber and Karplus with modifications to eliminate corner-cutting and sliding-down 
problems [^|, and to improve the stability and convergence properties [l^. Maragakis et al. 
applied the NEB method to various physical systems ranging from semiconductor materials 
to biologically relevant molecules. They report that use of powerful minimisation methods 
in conjunction with NEB approach was unsuccessful 79]. These problems were attributed 
to instabilities with respect to the extra parameters introduced by the springs. In fact. 
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the NEB approach has previously been used with the L-BFGS algorithm in other work, 
where hybrid eigenvector-following techniques were employed to produce tightly converged 
transition states from guesses obtained by NEB calculations 0, Is^. The main result 
of the present contribution is a modified 'doubly nudged' elastic band (DNEB) method, 
which is stable when combined with the L-BFGS minimiser. In comparing the DNEB 
approach with other methods we have also analysed quenched velocity Verlet minimisation, 
and determined the best point at which to remove the kinetic energy. Extensive tests 
show that the DNEB/L-BFGS combination provides a significant performance improvement 
over previous implementations. We therefore outline a new strategy to connect distant 
minima, which is based on successive DNEB searches to provide transition state candidates 
for refinement by eigenvector-following. 



II. METHODS 



In the present work we used the nudged elastic band [ill [3| (NEB) and eigenvector- 
following Q, 0, Q, Q, 0, 0, 0, 0, S S, Q 

(EF) methods for locating and re- 
fining transition states. In the NEB approach the path is represented as a set of images 
{Xi, X2...Xjv} that connect the endpoints Xq and X/yr+i, where Xj is a vector containing 
the coordinates of image i (Figure Q]) [t^. In the usual framework of double-ended method- 



ologies 



90| the endpoints are stationary points on the potential energy surface (PES) (usually 



minima), which are known in advance. In addition to the true potential, VJ, which binds 
the atoms within each image, equivalent atoms in adjacent images are interconnected by 
+ 1 springs according to a parabolic potential, 

N+l 

V = \kspr |Xj — Xj_i|^. (1) 

i=l 

Subsequently these potentials will be referred to as the 'true potential' and the 'spring 
potential', respectively. 

The springs are intended to hold images on the path during optimisation — otherwise 
they would slide down to the endpoints, or to other intermediate minima j^. Occasionally, 
depending on the quality of the initial guess, we have found that some images may converge 
to higher index stationary points. One could imagine the whole construction as a band or 
rope that is stretched across the PES, which, if optimised, is capable of closely following a 
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curve defined in terms of successive minima, transition states, and tlie intervening steepest- 
descent patlis. 

In practice, tlie above formulation encounters difficulties connected with the coupling 
between the 'true' and 'spring' components of the potential. The magnitude of the springs' 
interference with the true potential is system dependent and generally gives rise to corner- 
cutting and sliding-down problems [71] . It is convenient to discuss these difficulties in terms 
of the components of the true gradient, g, and spring gradient, g, parallel and perpendicular 
to the path. The parallel component of the gradient g" at image i on the path is obtained 
by projecting out the perpendicular component g-*- using an estimate of the tangent to the 
path. The parallel and perpendicular components for image i are: 



where Vi = V (Xj), and the unit vector Tj is the tangent. Here and throughout this paper 
we denote unit vectors by a hat. The complete gradient, g, has N x t] components for a 
band of images with rj atomic degrees of freedom each. 

Corner-cutting has a significant effect when a path experiences high curvature. Here g-*" 
is large, which prevents the images from closely following the path because the spring force 
necessarily has a significant component perpendicular to the tangent. The sliding-down 
problem occurs due to the presence of g", which perturbs the distribution of images along 
the path, creating high-resolution reg ions (around the local minima) and low-resolution 
regions (near the transition states) 71] . Both problems significantly affect the ability of the 
NEB method to produce good transition state candidates. We have found that sliding-down 
and corner-cutting are interdependent and cannot both be remedied by adjusting the spring 
force constant kgpr', increasing kspr may prevent sliding-down but it will make corner-cutting 
worse. 

The aforementioned problems can sometimes be eliminated by constructing the NEB 
gradient from the potential in the following way: g" and g^ are projected out, which gives 
the elastic band its 'nudged' property Removal of g" can be thought of as bringing the 
path into a plane or flattening the PES [Figure ^b)], while removal of g"*" is analogous to 
making the images heavier so that they favour the bottom of the valley at all times. 

The choice of a method to estimate the tangent to the path is important for it affects 
the convergence of the NEB calculation. Originally, the tangent vector, Tj, for image i 
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was o 
t - 1 



jtained by normalising the line segment between the two adjacent images, i + 1 and 

(3) 



21|: 

Xj+i — Xi_i 



However, kinks can develop during optimisation of the image chain using this definition of 
Tj. It has been shown [75] that kinks are likely to appear in the regions where the ratio 
g\ / gf is larger than the length of the line segment, |t|, used in estimating the tangent 
[Figure Efa)]. 

Both the above ratio, the image density and |t| can vary depending on the system 
of interest, the particular pathway and other parameters of the NEB calculation. From 
equation (jH} it can be seen that -fj, and, hence, the next step in the optimisation of image 
is determined by its neighbours, which are not necessarily closer to the path than image 
i. Therefore, a better approach in estimating the Tj would be to use only one neighbour, 
since then we only need this neighbour to be better converged than image i. 

There are two neighbours to select from, and it is natural to use the higher-energy one 
for this purpose, since steepest-descent paths are easier to follow downhill than uphill: 

(j - z) (X, - X.) 

where i and j are two adjacent images with energies Ei and Ej, and Ei < Ej. In this way, an 
image i that has one higher-energy neighbour j behaves as if it is 'hanging' on to it [Figure 

Hb)]. 

The above tangent formulation requires special handling of extrema along the path, and 
a mechanism for switching r at such points was proposed |73]. It also fails to produce 
an even distribution of images in regions with high curvature [Figure |2tc)]. We presume 
that Henkelman and Jonsson substitute (g ■ r) r by |g|T in equation (0) to obtain a spring 
gradient formulation that will keep the images equispaced when the tangent from equation 
dH) is used in the projections: [3| 

il' = kspr (|Xi - Xi_i| - |Xi+i - Xil) f i. (5) 

We have previously used the NEB approach to produce candidate transition state guesses 



for further refinement using hybrid EF methods 8l|,|82], which avoid either calculating the 



Hessian or diagonalising it 



Having obtained tightly converged transition states 
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approximate steepest-descent paths are calculated by energy minimisation, as discussed in 

In the present work the NEB approach has been used in combination with two minimisers, 
namely the quenched velocity Verlet (QVV) and the limited-memory Broyden-Fletcher- 
Goldfarb-Shanno (L-BFGS) algorithms. The QVV method is based on the velocity Verlet 
algorithm jo^ (VV) as modified by Jonsson et al. 3| and was originally used for NEB 
optimisation. VV is a symplectic integrator that enjoys widespread popularity, primarily in 
molecular dynamics (MD) simulations where it is used for numerical integration of Newton's 
equations of motion. At each time step 5t the coordinates and the velocities V are updated 
from the coupled first-order differential equations in the following manner 

X{t + 6t) = X (t) + 6tV (t) - — g (t) (6) 
v(t+l.t)^V(t)-|-g(t) (7) 

V{t + 6t) = v(^t+ht^-^g{t + 6t) (8) 

The algorithm involves two stages, with a force evaluation in between. First the positions 
are updated according to equation (0), and the velocities at midstep t + 5t/2 are then 
computed using equation ((7j). After the evaluation of the gradient at time t + 5t the velocity 
is updated again [equation (jHJ] to complete the move. To obtain minimisation it is necessary 
to remove kinetic energy, and this can be done in several ways. If kinetic energy is removed 
completely every step the algorithm is equivalent to a steepest-descent minimisation, which 
is rather inefficient. Instead, it was proposed by Jonsson et al. [ill to keep only the velocity 
component that is antiparallel to the gradient at the current step. If the force is consistently 
pointing in the same direction the system accelerates, which is equivalent to increasing the 
time step [^|. However, a straightforward variable time step version of the above algorithm 
was reported to be unsuccessful 9^ . 

L-BFGS is a version of the BEGS algorithm that limits the storage used and is hence 
particularly suitable for large-scale problems 3] • The difference between the L-BFGS algo- 
rithm and standard BEGS is in the Hessian matrix update. In order to store each correction 
to the Hessian 2n storage locations are needed, n being the dimensionality of the prob- 
lem L-BFGS stores a maximum of m corrections and the Hessian matrix is never 
formed explicitly. Every iteration — H~^g is computed according to a recursive formula de- 
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scribed by Nocedal '94'. For the first m iterations L-BFGS is identical to the BFGS method, 
but after these the oldest correction is discarded. Since only the m most recent corrections 
are retained L-BFGS uses less storage. Here we employed a modified version of Nocedal's 
L-BFGS implementation in which the line search was removed and the maximum step 
size was limited for each image separately. 

It is noteworthy that the objective function corresponding to the projected NEB gradient 
is unknown, but it is not actually required in either of the minimisation algorithms that we 
consider. 



III. RESULTS 

The springs should distribute the images evenly along the NEB path during the opti- 
misation, and the choice of kspr must be made at the beginning of each run. It has been 
suggested by Jonsson and coworkers that since the action of the springs is only felt along 



the path the value of the spring constant is not critical as long as it is not zero j2l|. If kgpr is 
set to zero then convergence is guaranteed for the first several tens of iterations only; even 
though gll is projected out, r fluctuates and further optimisation will eventually result in 
the majority of images gradually sliding down to local minima |7l| . 

A. Slow-response Quenched Velocity Verlet 

In practice we find that the value of kspr affects the convergence properties and the sta- 
bility of the optimisation process. This result depends on the type of minimiser employed 
and may also depend on minimiser-specific settings. Here we analyse the convergence prop- 
erties of NEB minimisations using the QVV minimiser (NEB/QVV) and their dependence 
on the type of velocity quenching. From previous work it is not clear when is the best time 



to perform quenching during the MD minimisation of the NEB [71|, [3; 22 1. Since the VV 
algorithm calculates velocities based on the gradients at both current and previous steps 
quenching could be applied using either of these gradients. 

Specifically, it is possible to quench velocities right after advancing the system using 
equation ©, at the half-step in the velocity evaluation (quenching intermediate velocities 
at time t + 6t/2) using either the old or new gradient [equation ((7j)], or after completion of the 
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velocity update. In Figure El we present results for the stability of NEB/QVV as a function 
of the force constant parameter for three of these quenching approaches. We will refer to an 
NEB optimisation as stable for a certain combination of parameters (e.g. time integration 
step, number of images) if the NEB steadily converges to a well-defined path and/or stays in 
its proximity until the maximal number of iterations is reached or the convergence criterion 
is satisfied. 

Figure El shows the results of several thousand optimisations for a 17-image band with the 
Miiller-Brown two-dimensional potential using QVV minimisation and a time step of 
0.01 (consistent units) for different values of kgpr. This widely used surface does not present 
a very challenging or realistic test case, but if an algorithm does not behave well for this 
system it is unlikely to be useful. Each run was started from the initial guess obtained using 
linear interpolation and terminated when the root-mean-square (RMS) gradient became less 
than 0.01. We define the RMS gradient for the NEB as 



where N is the number of images in the band and t] is the number of atomic degrees of 
freedom available to each image. 

It seems natural to remove the velocity component perpendicular to the gradient at the 
current point when the geometry X (t), gradient g (t) and velocity V (t) are available, i.e. 



where Vq (t) is the velocity vector after quenching. However, we found this approach to be 
the least stable of all — the optimisation was slow and convergence was very sensitive to the 
magnitude of time step. Hence we do not show any results for this type of quenching. 

From Figure Efa) we see that the best approach is to quench the velocity after the 
coordinate update. The optimisation is then stable for a wide range of force constant 
values, and the images on the resulting pathway are evenly distributed. In this quenching 
formulation the velocity response to a new gradient direction is retarded by one step in 
coordinate space: the step is still taken in the direction V (t) but the corresponding velocity 
component is removed. To implement this slow-response QVV (SQVV) it is necessary to 
modify the VV algorithm described in section IIII Al by inserting equation ()10|) in between 
the two stages described by equations (jH)) and ((7|). 




(9) 




(10) 
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The second best approach after SQVV is to quench the velocity at midstep t + 5t/2 using 
the new gradient. On average, this algorithm takes twice as long to converge the NEB to 
a given RMS gradient tolerance compared to SQVV. However, the method is stable for the 
same range of spring force constant values and produces a pathway in which the images are 
equispaced more accurately than the other formulations [see Figure Efb)]. 

The least successful of the three QVV schemes considered involves quenching velocities 
at mid-step using the gradient from the previous iteration (stars in Figure Ej). Even though 
the number of iterations required is roughly comparable to that obtained by quenching 
using the new gradient, it has the smallest range of values for the force constant where it 
is stable. Some current implementations of NEB [qtI . losj l (intended for use in combination 
with electronic structure codes) use this type of quenching in their QVV implementation. 

We have also conducted analogous calculations for more complicated systems such as 
permutational rearrangements of Lennard- Jones clusters. The results are omitted for brevity, 
but agree with the conclusions drawn from the simpler 2D model described above. The same 
is true for the choice of force constant investigated in the following section. 



B. Choice of the Force Constant 



We find that if the force constant is too small many more iterations are needed to converge 
the images to the required RMS tolerance, regardless of the type of quenching. In addition, 
the path exhibits a more uneven image distribution. This result occurs because at the 
initial stage the images may have very different gradients from the true potential along the 
band, because they lie far from the required path, and the true potential gradient governs 
the optimisation. When the true RMS force is reduced the springs start to play a more 
important role. But at this stage the forces are small and so is the QVV step size. The 
influence of the springs is actually most important during the initial optimisation stage, for 
it can determine the placement of images in appropriate regions. It is less computationally 
expensive to guide an image into the right region at the beginning of an optimisation than to 
restore the distribution afterwards by dragging it between two minima through a transition 
state region. 

If kspr is too big the NEB never converges to the required RMS gradient tolerance value. 
Instead, it stays in proximity to the path but develops oscillations: adjacent images start 
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to move in opposite directions. For all types of quenching we observed similar behaviour 
when large values of the force constant were used. This problem is related to the step in 
coordinate space that the optimiser is taking: for the SQVV case simply decreasing the time 
step remedies this problem. 



C. Comparison of SQVV and L-BFGS Minimisers for the MB Surface 

We tested the NEB/L-BFGS method by minimising a 17-image NEB for the two- 
dimensional Miiller-Brown surface 196|. Our calculations were carried out using the OPTIM 

n n 

program |99|. The NEB method in its previous formulation [76|| and a modified L-BFGS 
minimiser p| were implemented in OPTIM in a previous discrete path sampling study jsi]. 
We used the same number of images, initial guess and termination criteria as described in 
section UlI Al to make the results directly comparable. 

Figure ID shows the performance of the L-BFGS minimiser as a function of kgpr- We 
used the following additional L-BFGS specific settings. The number of corrections in the 
BFGS update was set to m = 4 (Nocedal's recommendation for the number of corrections 
is 3 ^ m ^ 7 the maximum step size was 0.1, and we limited the step size for each 

image separately, i.e. 

Ip.1^0.1, (11) 

where pj is the step for image j. The diagonal elements of the inverse Hessian were initially 
set to 0.1. 

From Figure ^ it can be seen that the performance of L-BFGS minimisation is relatively 
independent of the choice of force constant. All the optimisations with 30 ^ fc^pr ^ 10, 000 
converged to the steepest-descent path, and, for most of this range, in less than 100 iterations. 
This method therefore gives roughly an order of magnitude improvement in speed over SQVV 
minimisation [see Figure Efa)]. 

We found it helpful to limit the step size while optimising the NEB with the L-BFGS 
minimiser. The magnitude and direction of the gradient on adjacent images can vary sig- 
nificantly. Taking bigger steps can cause the appearance of temporary discontinuities and 
kinks in the NEB. The NEB still converges to the correct path, but it takes a while for these 
features to disappear and the algorithm does not converge any faster. 
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D. Doubly Nudged Elastic Bands 

The NEB/QVV approach has previously been systematically tested on systems with 
around 100 degrees of freedom, r] [22|- However, in the majority of cases these test systems 
could be divided into a 'core' and a smaller part that actually changes significantly. The 
number of active degrees of freedom is therefore significantly smaller than the total number in 
these tests. For example, prototropic tautomerisation of cytosine nucleic acid base [r] = 33) 
involves motion of one hydrogen atom along a quasi-rectilinear trajectory accompanied by 
a much smaller distortion of the core. 

We have therefore tested the performance of the NEB/SQVV and NEB/L-BFGS schemes 
for more complicated rearrangements of Lennard- Jones (LJ) clusters to validate the results 
of ^III Bl and to investigate the stability and performance of both approaches when there are 
more active degrees of freedom. Most of our test cases involve permutational isomerisation of 
the LJ7, LJ38 and LJ75 clusters. These examples include cases with widely varying separation 
between the endpoints, integrated path length, number of active degrees of freedom and 
cooperativity. 

Permutational rearrangements are particularly interesting because it is relatively difficult 
to produce an initial guess for the NEB run. In contrast, linear interpolation between the 
endpoints was found to provide a useful initial guess for a number of simpler cases For 
example, it was successfully used to construct the NEB for rearrangements that involve one 
or two atoms following approximately rectilinear trajectories, and for migration of a single 
atom on a surface For more complex processes an alternative approach adopted in 

previous work is simply to supply a better initial guess 'by hand', e.g. construct it from the 
images with unrelaxed geometries containing no atom overlaps j79|. The 'detour' algorithm 



described in previous calculations that emp 
'atom-crashing' in the initial interpolation 



■oy the ridge method could also be used to avoid 



It has previously been suggested that it is important to eliminate overall rotation and 
translation (ORT) of each image during the optimisation of an NEB jl^. We have imple- 
mented this constraint in the same way as Jonsson et ai, by freezing one atom, restricting 
the motion of a second atom to a plane, and constraining the motion of a third atom to a 
line by zeroing the appropriate components of the NEB gradient. 

We were able to obtain stable convergence in NEB/L-BFGS calculations only for simple 
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rearrangements, which confirms that straightforward L-BFGS optimisation of the NEB is 
unstable [t^. Figure El shows the performance of the NEB/SQVV [FigureEfa)] and NEB/L- 
BFGS [FigureEfb)] approaches for one such rearrangement. These calculations were carried 
out using a 7-image NEB both with (diamonds) and without (stars) removing ORT for 
isomerisation of an LJ7 cluster (global minimum —>■ second- lowest minimum). The number 
of iterations, i, is proportional to the number of gradient evaluations regardless of the type 
of minimiser. Hence, from Figure El we conclude that for this system NEB/L-BFGS is faster 
than NEB/SQVV by approximately two orders of magnitude. However, removal of ORT 
leads to instability in the NEB/L-BFGS optimisation: the images do not stay in proximity 
to the required path for long and instead diverge from it [see inset in Figure Efb)]. 

By experimentation we have found that the main source of the instabilities is the complete 
removal of g"*". Instead, the inclusion of some portion of g-*" in the NEB gradient, i.e. 

gNEB = g^+g"+g*, (12) 

where g* = ^g"*", makes the NEB/L-BFGS calculations stable but introduces some additional 
corner-cutting, as well as an extra parameter, ^. Since we use the transition state candidates 
from NEB as starting points for further EF calculations the corner-cutting is not a drawback 
as long as the transition state candidates are good enough. By adjusting ^ in the range of 
(0.01,0.1) we were able to achieve satisfactory performance for the NEB/L-BFGS method 
in a number of cases. However, an alternative modification, described below, proved to be 
even more successful. 

The drawback of the NEB gradient described by equation (fT^ stems from the interference 
of g-*" and ^g"*", and becomes particularly noticeable when the projection of .^g"*" on g-*- and 
g-*" itself are of comparable magnitude. This problem is analogous to the interference of g 
and g in the original elastic band method, which was previously solved by 'nudging' (7fil |. 
We have therefore constructed the gradient of a new 'doubly' nudged elastic band (DNEB) 
using 

r = - (g^ ■ g^)g^- (13) 

In this formulation some corner-cutting may still occur because the images tend to move 
cooperatively during optimisation; the spring gradient goNEB acting on one image can still 
indirectly interfere with the true gradients of its neighbours. In our calculations this draw- 
back was not an issue, since we are not interested in estimating properties of the path directly 
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from its discrete representation. Instead we construct it from steepest-descent paths calcu- 
lated after converging the transition states tightly using the EF approach. We have found 
DNEB perfectly adequate for this purpose. 

We have also tested a number of approaches that might be useful if one wants to produce a 
full pathway involving a number of transition states for a complicated rearrangement in just 
one NEB run. One of these, for instance, is a gradual removal of the g* component from the 
NEB gradient once some convergence criterion is achieved. This removal works remarkably 
well, particularly in situations with high energy initial guesses, which occur frequently if the 
guessing is fully automated. This adjustment can be thought of as making the band less 
elastic in the beginning in order to resolve the highest-energy transition state regions first. 

E. Comparison of the DNEB/L-BFGS and DNEB/SQVV Methods for 
Permutational Isomerisations of LJy 

It is sometimes hard to make a direct comparison of different double-ended methods for a 
particular rearrangement because the calculations may converge to different paths. Another 
problem concerns the choice of a consistent termination criterion: the RMS force usually 
converges to some finite system-dependent value, which in turn may depend on the number 
of images and other parameters. A low-energy chain of NEB images does not necessarily 
mean that a good pathway has been obtained, since it may arise because more images are 
associated with regions around local minima, rather than the higher energy transition state 
regions. Here we present the results of DNEB/L-BFGS, DNEB/SQVV and, where possible, 
NEB/SQVV calculations for all the distinct permutational rearrangements of the global 
minimum for the LJ7 cluster (see Figure IHl for the endpoints and nomenclature). 

It is possible to draw a firm conclusion as to how well the NEB represents the pathway 
when the corresponding stationary points and steepest-descent paths are already known. 
We therefore base our criterion for the effectiveness of an NEB calculation on whether 
we obtain good estimates of all the transition states. By considering several systems of 
increasing complexity we hope to obtain comparisons that are not specific to a particular 
pathway. 

Connections between two minima are defined by calculating an approximation to the 
two steepest-descent paths that lead downhill from each transition state, and two transition 
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states are considered connected if they are linked to the same minimum via a steepest- 
descent path. We will say that minima are 'connected' if there exists a path consisting of 
one or more transition states and intermediate minima linking them. Permutational isomers 
of the same minimum are distinguished in these calculations. We refer to the chain of images 
produced by the NEB calculation as 'connected' if going downhill from each transition state 
using steepest-descent minimisation yields a set of minima that contains the endpoints linked 
together. 

For NEB/SQVV calculations we used the NEB formulation defined in Ref. H DNEB 
is different from the above method because it includes an additional component in the 
NEB gradient, as described by equations ()12|1 and (fT!?|) . In addition, for the following 
DNEB calculations we did not remove overall rotation and translation (ORT), because we 
believe it is unnecessary when our gradient modification is used. To converge transition 
state candidates tightly we employed EE optimisation, limiting the maximum number of 
EE iterations to five with an RMS force convergence tolerance of 10~^. (Standard reduced 
units for the Lennard- Jones potential are used throughout this paper.) Initial guesses for 
all the following calculations were obtained by linear interpolation between the endpoints. 
To prevent 'atom-crashing' from causing overflow in the initial guess we simply perturbed 
such images slightly using random atomic displacements of order 10~^ reduced units. 

In each case we first minimised the Euclidean distance between the endpoints with re- 



spect to overall rotation and translation using the method described in Ref. 110(1 SQVV 
minimisation was performed with a time step of 0.01 and a maximum step size per degree of 
freedom of 0.01. This limit on the step size prevents the band from becoming 'discontinuous' 
initially and plays an important role only during the first 100 or so iterations. The limit 
was necessary because for the cases when the endpoints are permutational isomers linear 
interpolation usually yields bands with large gradients, and it is better to refrain from taking 
excessive steps at this stage. We did not try to select low energy initial guesses for each 
rearrangement individually, since one of our primary concerns was to automate this process. 
For the same reason, all the L-BFGS optimisations were started from guesses preoptimised 
using SQVV until the RMS force dropped below 2.0. 

Table |l] shows the minimum number of images and gradient calls required to produce 
a connected pathway using the DNEB/L-BFGS and DNEB/SQVV methods. These cal- 
culations were run assuming no prior knowledge of the path. Normally there is no initial 
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information available on the integrated path length or the number of intermediate minima 
between the endpoints, and it takes some experimentation to select an appropriate number 
of images. Our strategy is therefore to gradually increase the number of images to make 
the problem as computationally inexpensive as possible. Hence we increment the number of 
images and maximum number of NEB iterations in each calculation until a connected path 
is produced, in the sense defined above. The permitted image range was 2 < < 20 and 
the maximum number of NEB iterations ranged from 1 < / < 3, 000 A^. We were unable to 
obtain connected pathways for any of the four LJ7 rearrangements using the NEB/SQVV 
approach. 

Table |n] presents the results of analogous calculations where we keep the number of im- 
ages fixed to 50. Unlike the performance comparison where the number of images is kept to 
a minimum (Table H}, these results should provide insight into the performance of the DNEB 
approach when there are sufficient images to resolve all the transition states. All the opti- 
misations for a particular rearrangement converged to the same or an enantiomeric pathway 
unless stated otherwise. The energy profiles that correspond to these rearrangements are 
shown in Figure 

From Table |l] and |n] we conclude that in all cases the DNEB/L-BFGS approach is 
more than an order of magnitude faster than DNEB/SQVV. It is also noteworthy that the 
DNEB/SQVV approach is faster than NEB/SQVV because overall rotation and translation 
are not removed. Allowing the images to rotate or translate freely can lead to numerical 
problems, namely a vanishing norm for the tangent vector, when the image density is very 
large or the spring force constant is too small. However, when overall rotation and transla- 
tion are not allowed there is less scope for improving a bad initial guess, because the images 
are more constrained. This constraint usually means that more images are needed or a 
better initial guess is required. Our experience is that such constraints usually slow down 
convergence, depending on which degrees of freedom are frozen: if these are active degrees of 
freedom (see above) the whole cluster must move instead, which is usually a slow, concerted 
multi-step process. 
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F. A revised connection algorithm 

In previous work we have used the NEB approach to supply transition state guesses 
for further EF refinement [sil . Is^ . Double-ended searches are needed in these discrete path 
sampling runs to produce alternative minimum-transition state-minimum- ■ ■ sequences from 
an initial path. The end minima that must be linked in such calculations may be separated by 
relatively large distances, and a detailed algorithm was described for building up a connected 
path using successive transition state searches. The performance of the DNEB/L-BFGS 
approach is sufficiently good that we have changed this connection strategy in our OPTIM 
program. In particular, the DNEB/L-BFGS method can often provide good candidates for 
more than one transition state at a time, and may even produce all the necessary transition 
states on a long path. However, it is still generally necessary to consider multiple searches 
between different minima in order to connect a pair of endpoints. In particular, we would like 
to use the minimum number of NEB images possible for reasons of efficiency, but automate 
the procedure so that it eventually succeeds or gives up after an appropriate effort for any pair 
of minima that may arise in a discrete path sampling run. These calculations may involve 
the construction of many thousands of discrete paths. As in previous work we converge 
the NEB transition state candidates using eigenvector-following techniques and then use 
L-BFGS energy minimisation to calculate approximate steepest-descent paths. These paths 



usually lead to local minima, which we a. 



so converge tightly. The combination of NEB and 



3, Q 



is similar to using NEB with a 'climbing 



hybrid eigenvector-followingtechniques 
image' as described in Ref. 

The initial parameters assigned to each DNEB run are the number of images and the 
number of iterations, which we specify by image and iteration densities. The iteration 
density is the maximum number of iterations per image, while the image density is the 
maximum number of images per unit distance. The distance in question is the Euclidean 
separation of the endpoints, which provides a crude estimation of the integrated path length. 
This approach is based on the idea that knowing the integrated path length, which means 
knowing the answer before we start, we could have initiated each DNEB run with the same 
number of images per unit of distance along the path. In general it is also impossible to 
provide a lower bound on the number of images necessary to fully resolve the path, since 
this would require prior knowledge of the number of intervening stationary points. Our 



18 



experience suggests that a good strategy is to employ as small an image and iteration 
density as possible at the start of a run, and only increase these parameters for connections 
that fail. 

All NEB images, i, for which Ei > Ei±i are considered for further EF refinement. The 
resulting distinct transition states are stored in a database and the corresponding energy 
minimised paths were used to identify the minima that they connect. New minima are also 
stored in a database, while for known minima new connections are recorded. Consecutive 
DNEB runs aim to build up a connected path by progressively filling in connections between 
the endpoints or intermediate minima to which they are connected. This is an advantageous 
strategy because the linear interpolation guesses usually become better as the separation 
decreases, and therefore fewer optimisation steps are required. Working with sections of a 
long path one at a time is beneficial because it allows the algorithm to increase the resolution 
only where it is needed. Our experience is that this approach is generally significantly faster 
than trying to characterise the whole of a complex path with a single chain of images. 

When an overall path is built up using successive DNEB searches we must select the two 
endpoints for each new search from the database of known minima. It is possible to base 



this choice on the order in which t 
strategy used in our previous work 



uhe transition states were found, which is basically the 
8l[ Is^ . However, when combined with the new DNEB 
approach a better strategy is to connect minima based upon their Euclidean separation. 
For this purpose it is convenient to classify all the minima into those already connected to 
the starting endpoint (the S set), the final endpoint (the F set), and the remaining minima, 
which are not connected to either endpoint (the U set). The endpoints for the next DNEB 
search are then chosen as the two that are separated by the shortest distance, where one 
belongs to S or F, and the other belongs to a different set. The distance between these 
endpoints is then minimised with respect to overall rotation and translation, and an initial 
guess for the image positions is obtained using linear interpolation. Further details of the 
implementation of this algorithm and the OPTIM program are available from authors upon 
request. 

As test cases for this algorithm we have considered various degenerate rearrangements 
of LJ7, LJ13, LJ38 and LJ75. (A degen erate rearrangement is one that links permutational 

isomers of the same structure Isl. llOll .) T he PES's of LJ38 and LJ75 have been analysed 

in a number of previous studies and are known to exhibit a double- 
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funnel morphology: for both clusters the two lowest-energy minima are structurally distinct 
and well separated in configuration space. This makes them useful benchmarks for the 
above connection algorithm. Figure |H1 depicts the energy profiles obtained using the revised 
connection algorithm for rearrangements between the two lowest minima of each cluster. In 
each case we have considered two distinct paths that link different permutational isomers 
of the minima in question, and these were chosen to be the permutations that give the 
shortest Euclidean distances. These paths will be identified using the distance between the 
two endpoints; for example, in the case of LJ38 we have paths LJ38 3.274 a and LJ38 3.956 cr, 
where is the pair equilibrium separation for the LJ potential. 

For each calculation we used the following settings: the initial image density was set to 
10 and the iteration density to 30. If a connection failed for a particular pair of minima 
then up to two more attempts were allowed before moving to the pair with the next smallest 
separation. For the second and third attempts the number of images was increased by 50% 
each time. The maximum number of EF optimisation steps was set to 30 with RMS force 
convergence criterion of 10~^. In Figure IHlevery panel is labelled with the separation between 
the endpoints, the number of transition states in the final pathway, the number of DNEB 
runs required, and the total number of gradient calls. 

Individual pathways involving a single transition state have been characterised using 
indices such as 



(EJX,(5)- 






-^t{F)t 



which is a measure of the number of atoms that participate in the rearrangement. Here 
Xi(S') and Xf(F) are the position vectors of atom t in the starting and finishing geometries, 
respectively. The largest values are marked in Figure |H1 next to the corresponding transition 
state. It is noteworthy that the pathways LJ38 3.956(j and LJ75 4.071cr both involve some 
highly cooperative steps, and the average value of is more than 12 for both of them. 

We have found that it is usually easier to locate good transition state candidates for a 
multi-step path if the stationary points are separated by roughly equal distances, in terms of 
the integrated path length. Furthermore, it seems that more effort is needed to characterise 
a multi-step path when transition states involving very different path lengths are present. 
In such cases it is particularly beneficial to build up a complete path in stages. To further 
characterise this effect we introduce a path length asymmetry index tt defined as tt = |s+ — 
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+ where s+ and s_ are the two integrated path lengths corresponding to the two 
downhill steepest-descent paths from a given transition state. For example, in rearrangement 
LJ38 3.956cr, five steps out of nine have vr > 0.5. 

Barrier asymmetry also plays a role in the accuracy of the tangent estimate, the image 
density required to resolve particular regions of the path, and in our selection process for 
transition state candidates, which is based on the condition Ei > Ei±i. To characterise this 
property we define a barrier asymmetry index, (3, as (3 = \E^ — E_\/max where 
Ej^ and E_ are the barriers corresponding to the forward and reverse reactions, respectively. 
The test cases in Figure |H1 include a variety of situations, with barrier asymmetry index 
j3 ranging from 0.004 to 1.000. The maximum values of vr and (3 are shown next to the 
corresponding transition states in this Figure. 

We note that the total number of gradient evaluations required to produce the above 
paths could be reduced significantly by optimising the DNEB parameters or the connection 
strategy in each case. However, our objective was to find parameters that give reasonable 
results for a range of test cases, without further intervention. 

IV. CONCLUSIONS 

The most important result of this work is probably the doubly nudged elastic band 
(DNEB) formulation, in which a portion of the spring gradient perpendicular to the path 
is retained. With this modification we found that L-BFGS minimisation of the images is 
stable, thus providing a significant improvement in efficiency. Constraints such as elimination 
of overall rotation and translation are not required, and the DNEB/L-BFGS method has 
proved to be reliable for relatively complicated cooperative rearrangements in a number of 
clusters. 

In comparing the performance of the L-BFGS and quenched velocity Verlet (QVV) meth- 
ods for optimising chains of images we have also investigated a number of alternative QVV 
schemes. We found that the best approach is to quench the velocity after the coordinate 
update, so that the velocity response to the new gradient lags one step behind the coordi- 
nate updates. However, this slow-response QVV (SQVV) method does not appear to be 
competitive with L-BFGS. 

Finally, we have revised our previous scheme for constructing connections between distant 
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minima using multiple transition state searches. Previously we have used an NEB/L-BFGS 
framework for this purpose, with eigenvector-following refinement of transition state candi- 
dates and characterisation of the connected minima using energy minimised approximations 
to the steepest-descent paths [sil Is^. When the DNEB/L-BFGS approach is used we have 
found that it is better to spend more effort in the DNEB phase of the calculation, since 
a number of good transition state guesses can often be obtained even when the number of 
images is relatively small. In favourable cases a complete path linking the required endpoints 
may be obtained in one cycle. Of course, this was always the objective of the NEB approach 
|2l[0,0,Ql; but we have not been able to achieve such results reliably for complex paths 
without the current modifications. When a number of transition states are involved we still 
find it more efficient to build up the overall path in stages, choosing endpoints that be- 
come progressively closer in space. This procedure has been entirely automated within the 
OPTIM program, which can routinely locate complete paths for highly cooperative multi- 
step rearrangements, such as those connecting different morphologies of the LJ38 and LJ75 
clusters. 
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VI. TABLES 



TABLE I: The minimal number of images and total number of gradient calls (in parentheses) 
are shown for degenerate rearrangements of LJ7. The image range was 2 < A'' < 20 and the 
iteration range was 1 < £ < 3, OOOA?^. Each SQW calculation was started from the guess produced 
using linear interpolation, while guesses for L-BFGS runs were preoptimised using DNEB/SQVV 
until the RMS force dropped below 2.0. Every iteration the images that satisfy Ei > E'iii were 
optimised further using eigenvector-following. The transition state candidates that converged to a 
true transition state within five iterations were used to generate the connected minima using energy 
minimisation. If this procedure yielded a connected pathway the calculation was terminated and 
the rest of the parameter range was not explored. Otherwise, the number of images was incremented 
and the procedure repeated. The number of gradient calls is a product of the number of images 
and the total number of iterations. For the L-BFGS calculations the number of iterations includes 
the SQW preoptimisation steps (100 on average) and the actual number of L-BFGS steps. Dashes 
signify cases where we were unable to obtain a connected pathway. 



Method 


1-2 


2-3 3-4 4-5 


DNEB/L-BFGS 


5(1720) 


18(30276) 11(2486) 18(8010) 


DNEB/SQVV 


16(21648) 


10(14310) 
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TABLE II: The minimal number of iterations needed to produce connected pathways for four 
degenerate rearrangements of LJ7 using a 50-image NEB. The strategy of this calculation is identical 
to the one described in the caption to Table ^ except that the number of images was fixed. 

Method 1-2 2-3 3-4 4-5 

DNEB/L-BFGS 131" 493 171 326 
DNEB/SQVV 1130 15178 2777 23405^ 
NEB/SQVV 11088^ - 30627 - 

" The number of iterations is the sum of the SQVV preoptimisation steps (100 on average) and 

the actual number of iterations needed by L-BFGS minimiser. ^ This value is not directly 
comparable since DNEB converged to a different path that contains more intermediate minima. 
Dashes signify cases where we were unable to obtain a connected pathway. 
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VII. FIGURE CAPTIONS 

1. Graphical representation of the nudged elastic band approach, (a) The optimised 
nudged elastic band for a two-dimensional model surface. The band contains 21 images 
and connects two minima Xq and X23. Image Xg has the highest energy and might 
therefore be used to estimate transition state properties or as a starting guess for 
further refinement, (b) 'Nudging': the NEB depicted in (a) is projected onto the 
xy plane and feels only the perpendicular component of the true gradient from the 
effective potential V^. 

2. Details of recent NEB implementations, (a) Conditions under which kinks appear 
during optimisation of the NEB using the tangent estimated from the line segment r 
connecting images i + 1 and z — 1. Displacement of image i from the path (dash-dotted 
line) creates forces = — g^^i and = — g,^. While F^ is a restoring force that 
originates from V-^^ 'Pf'-i is destabilising and originates from (and is non-zero due to 
the fact that the tangent at image i — 1 has changed after displacement of image i). For 
the case of small displacements the potential may be resolved into two contributions, 

= {k^/2)x'^ and V^" = —k^^y, and kinks will not appear if fc" /k-^ < \ f\. (b) Tangent 
estimate using the higher energy neighbour: image i + 1 is 'hanging' on to image i. 
The separation d is controlled by the lower-lying images {> i + 1) but not V. (c) An 
NEB that follows the curved region of the path: since the spring force Fi acting on 
image i is compensated by projection F2, the distribution of images becomes uneven, 
(d) Corner-cutting displayed on a cross-section of the curved part of the path depicted 
in (c): the image is displaced from the path due to the presence of F^^. 

3. (a) Number of iterations, i, and (b) average deviation from the average image sepa- 
ration, q, as a function of the spring force constant, kgp^, obtained using a 17-image 
NEB on the Miiller-Brown surface 96| . Minimisation was performed using QVV with 
time step 0.01 and RMS force termination criterion 0.01. The number of iterations is 
shown for velocity quenching after the coordinate update (diamonds), after the gradi- 
ent evaluation (squares) and at the half-step through the velocities update (stars). 

4. (a) Number of iterations, i, and (b) average deviation from average image separation, 

as a function of the spring force constant, kgpr, obtained using a 17-image NEB 
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for the Miiller-Brown surface Minimisation was performed using L-BFGS with 
number of corrections m = 4, maximum step size 0.1 and RMS force termination 
criterion 0.01. 

5. RMS gradient S'rms ^ function of iteration number L A 7- image NEB was used 
to model an isomerisation path in the LJ7 cluster (global minimum — second-lowest 
minimum). Minimisation was performed using the SQVV (a) and L-BFGS (b) meth- 
ods. Results are shown for minimisations with and without removing overall rotation 
and translation (diamonds and stars, respectively). The inset in (a) depicts the aver- 
age deviation from the average image separation, as a function of iteration number 
for minimisations using SQVV, while the inset in (b) shows g^ius recorded for 1000 
iterations of L-BFGS minimisations. These calculations were all continued for a fixed 
number of iterations, regardless of convergence. 

6. Structures of the most stable isomers for (a) LJ7, (b) LJ38 and (c) LJ75 clusters, which 
were used as endpoints in the NEB calculations. The first endpoint was the global 
minimum in each case. For LJ38 and LJ75 the second endpoint was chosen to be second- 
lowest minimum shown on the right in parts (b) and (c), respectively, while a permuta- 
tional isomer of the global minimum was used as the second endpoint in all the LJ7 cal- 
culations. The notation 1-2 denotes an LJ7 rearrangement where the second endpoint 
is structure (a) with atoms 1 and 2 swapped. The structures and numbering employed 
for LJ38 and LJ75 are defined at |http:/ /www- wales. ch.cam.ac.uk/^sat3 9/DNEBtests/( 

7. The energy, as a function of the integrated path length, s, for four degenerate 
rearrangements of LJ7. These profiles were constructed using energy minimisation to 
characterise the paths connected to transition states obtained by EF refinement of 
candidate structures obtained from DNEB calculations [s^. 

8. The energy, E, as a function of integrated path length, s, for pathways linking the two 
lowest minima of LJ38 and LJ75. Calculations were initiated between two different sets 
of permutational isomers of these minima. For each profile the number of transition 
states, Nt, number of DNEB runs, Nc-, and the total number of gradient calls, Ng, 
are shown. Maximum values of N , (3 and vr are marked next to the corresponding 
transition states. The endpoints were illustrated in Figure El 
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